Working Set Requirements and Performance of Network Caches in Cluster-Based Multiprocessors
Abstract
This paper evaluates network caching as a means to improve the performance of cluster-based multiprocessors. A network cache, shared by all processors on each cluster, offers the potential benefits of increased intra-cluster sharing, reduced network traffic, and useful prefetching. Using simulation, we evaluate the feasibility, structure, and performance of a network cache implementation. Five well-known parallel scientific applications are used in this study. We experimentally derive the network cache working sets of these applications, and demonstrate that the size requirements of the network cache are feasible using current technology. Using cache sizes derived from this working set analysis, we compare a conservative network cache implementation to that of an aggressive directory-based scheme without a network cache. In all five applications, the inclusion of the network cache improves performance. Finally, we examine the effect on application performance of varying the network cache line size and the number of processors per cluster.
Similar resources
Techniques for Reducing the Impact of Inclusion in Shared Network Cache Multiprocessors
This paper investigates design alternatives for shared network caches in cluster-based multiprocessors. Using simulation, we first demonstrate that network caches offer several potential performance benefits, but that the adverse impact of cache inclusion-related evictions must be mitigated for these benefits to be fully realized. We then evaluate three network cache architectural alternatives design...
The Performance Value of Shared Network Caches in Clustered Multiprocessor Workstations
This paper evaluates the benefit of adding a shared cache to the network interface as a means of improving the performance of networked workstations configured as a distributed shared memory multiprocessor. A cache on the network interface, shared by all processors on each cluster, offers the potential benefits of retaining evicted processor cache lines, providing implicit prefetching when network cac...
Operating System Design Principles for Scalable Shared Memory Multiprocessors
We describe SALSA, an operating system that incorporates techniques for achieving scalability in large-scale shared-memory NUMA multiprocessors. We evaluate the effects of cache organization and caching policy on latency hiding via rapid thread switching. With write-back set-associative caches, we demonstrate significant improvements in program performance with latency hiding when cache miss laten...
Reducing Response Time with Preheated Caches
CPU performance is increasingly limited by thermal dissipation, and soon aggressive power management will be beneficial for performance. In particular, temporarily idle parts of the chip (including the caches) should be power-gated in order to reduce leakage power. Current CPUs already lose their cache state whenever the CPU is idle for extended periods of time, which causes a performance loss when...
Software Caching on Cache-Coherent Multiprocessors
Programmers have always been concerned with data distribution and remote memory access costs on shared-memory multiprocessors that lack coherent caches, like the BBN Butterfly. Recently, memory latency has become an important issue on cache-coherent multiprocessors, where dramatic improvements in microprocessor performance have increased the relative cost of cache misses and coherency transaction...
Publication date: 1994